USegScene: Unsupervised Learning of Depth, Optical Flow and Ego-Motion with Semantic Guidance and Coupled Networks
In this paper we propose USegScene, a framework for semantically guided
unsupervised learning of depth, optical flow and ego-motion estimation for
stereo camera images using convolutional neural networks. Our framework
leverages semantic information for improved regularization of depth and optical
flow maps, multimodal fusion, and occlusion filling, treating dynamic rigid
object motions as independent SE(3) transformations. Furthermore, complementary
to pure photometric matching, we propose matching of semantic features,
pixel-wise classes and object instance borders between the consecutive images.
In contrast to previous methods, we propose a network architecture that jointly
predicts all outputs using shared encoders and allows passing information
across task domains, e.g., the prediction of optical flow can benefit from
the depth prediction. Furthermore, we explicitly learn the depth and
optical flow occlusion maps inside the network, which are leveraged in order to
improve the predictions in the respective regions. We present results on the
popular KITTI dataset and show that our approach outperforms other methods by a
large margin.
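
To make the depth-flow coupling described above concrete, the following is a minimal geometric sketch, not the authors' network: given a depth map, camera intrinsics K, and a rigid SE(3) motion (R, t), it synthesizes the optical flow that this motion induces. The function name and array layouts are illustrative assumptions.

```python
import numpy as np

def rigid_flow_from_depth(depth, K, R, t):
    """Synthesize the optical flow induced by a rigid SE(3) motion (R, t),
    given a depth map (H, W) and camera intrinsics K (3, 3).

    A generic geometric sketch of the depth/flow coupling, not the
    USegScene implementation; all names here are illustrative.
    """
    h, w = depth.shape
    # Pixel grid in homogeneous coordinates, shape (3, H*W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)
    # Back-project pixels to 3-D camera coordinates using the depth map.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    # Apply the rigid-body motion as an SE(3) transformation.
    pts_moved = R @ pts + t.reshape(3, 1)
    # Re-project into the image; the pixel displacement is the flow.
    proj = K @ pts_moved
    proj = proj[:2] / proj[2:3]
    flow = proj - pix[:2]
    return flow.reshape(2, h, w)
```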
Learning Object Placements For Relational Instructions by Hallucinating Scene Representations
Robots that coexist with humans in their environment and perform services
for them need the ability to interact with them. One particular requirement for
such robots is that they are able to understand spatial relations and can place
objects in accordance with the spatial relations expressed by their user. In
this work, we present a convolutional neural network for estimating pixelwise
object placement probabilities for a set of spatial relations from a single
input image. During training, our network receives the learning signal by
classifying hallucinated high-level scene representations as an auxiliary task.
Unlike previous approaches, our method does not require ground truth data for
the pixelwise relational probabilities or 3D models of the objects, which
significantly expands the applicability in practical applications. Our results
obtained using real-world data and human-robot experiments demonstrate the
effectiveness of our method in reasoning about the best way to place objects to
reproduce a spatial relation. Videos of our experiments can be found at
https://youtu.be/zaZkHTWFMKM
Comment: Accepted at the 2020 IEEE International Conference on Robotics and
Automation (ICRA). Video at https://www.youtube.com/watch?v=zaZkHTWFMK
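
As an illustration of how pixelwise placement probabilities of this kind could be consumed downstream, here is a minimal sketch; the (num_relations, H, W) array layout and the function name are assumptions for illustration, not the paper's interface.

```python
import numpy as np

def best_placement(logits, relation_idx):
    """Pick the most likely placement pixel for one spatial relation.

    `logits` is assumed to be a (num_relations, H, W) array of raw
    per-pixel scores, e.g. the output of a placement network like the
    one the abstract describes; this layout is an assumption.
    """
    scores = logits[relation_idx]
    # Spatial softmax turns raw scores into a pixelwise probability map.
    exp = np.exp(scores - scores.max())
    prob = exp / exp.sum()
    # The argmax pixel is the suggested placement location (row, col).
    return np.unravel_index(np.argmax(prob), prob.shape), prob
```

For example, `best_placement(logits, relation_idx=2)` would return the suggested (row, col) pixel for the third relation together with the full probability map.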
Long-Term Urban Vehicle Localization Using Pole Landmarks Extracted from 3-D Lidar Scans
Due to their ubiquity and long-term stability, pole-like objects are well
suited to serve as landmarks for vehicle localization in urban environments. In
this work, we present a complete mapping and long-term localization system
based on pole landmarks extracted from 3-D lidar data. Our approach features a
novel pole detector, a mapping module, and an online localization module, each
of which is described in detail, and for which we provide an open-source
implementation at www.github.com/acschaefer/polex. In extensive experiments, we
demonstrate that our method improves on the state of the art with respect to
long-term reliability and accuracy: First, we prove reliability by tasking the
system with localizing a mobile robot over the course of 15 months in an urban
area based on an initial map, confronting it with constantly varying routes,
differing weather conditions, seasonal changes, and construction sites. Second,
we show that the proposed approach clearly outperforms a recently published
method in terms of accuracy.
Comment: 9 pages
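
To illustrate the kind of geometry underlying pole-based localization, here is a minimal sketch of a textbook least-squares (Kabsch) alignment between detected and mapped pole positions; it assumes data association has already been done and is not the paper's pipeline.

```python
import numpy as np

def align_poles(detected, mapped):
    """Estimate the 2-D rigid transform (R, t) that maps detected pole
    positions onto their matched map counterparts.

    `detected` and `mapped` are assumed to be (N, 2) arrays of
    already-associated pole landmarks; this is a standard least-squares
    alignment shown for illustration, not the paper's method.
    """
    mu_d, mu_m = detected.mean(axis=0), mapped.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (detected - mu_d).T @ (mapped - mu_m)
    U, _, Vt = np.linalg.svd(H)
    # Enforce a proper rotation (determinant +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = mu_m - R @ mu_d
    return R, t  # vehicle pose in map frame: x_map = R @ x_vehicle + t
```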